High Rank Matrix Completion with Side Information

Authors

  • Yugang Wang
  • Ehsan Elhamifar
Abstract

We address the problem of high-rank matrix completion with side information. In contrast to existing work dealing with side information, which assumes that the data matrix is low-rank, we consider the more general scenario where the columns of the data matrix are drawn from a union of low-dimensional subspaces, which can lead to a high-rank matrix. Our goal is to complete the matrix while taking advantage of the side information. To do so, we use the self-expressive property of the data, searching for a sparse representation of each column of the matrix as a combination of a few other columns. More specifically, we propose a factorization of the data matrix as the product of side information matrices with an unknown interaction matrix, under which each column of the data matrix can be reconstructed using a sparse combination of other columns. As our proposed optimization, which searches for missing entries and sparse coefficients, is non-convex and NP-hard, we propose a lifting framework, in which we couple the sparse coefficients and missing values and define an equivalent optimization that is amenable to convex relaxation. We also propose a fast implementation of our convex framework using a Linearized Alternating Direction Method. By extensive experiments on both synthetic and real data, and, in particular, by studying the problem of multi-label learning, we demonstrate that our method outperforms existing techniques in both low-rank and high-rank data regimes.

Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Introduction

Matrix completion, the problem of estimating the missing entries of an incomplete matrix, is a fundamental task in machine learning with numerous applications, including collaborative filtering for recommender systems (Rennie and Srebro 2005; Sindhwani et al. 2010), multi-label learning (Natarajan and Dhillon 2014; Xu, Jin, and Zhou 2013; Argyriou, Evgeniou, and Pontil 2008), semi-supervised clustering (Chiang et al. 2014) and global positioning (Singer and Cucuringu 2010; Singer 2008; Biswas et al. 2006).

Existing algorithms that deal with missing entries in data can be divided into two main categories. The first group of algorithms, such as Probabilistic PCA (PPCA) (Tipping and Bishop 1999b), Factor Analysis (FA) (Knott and Bartholomew 1999) and Convex Low-Rank Matrix Completion (Candes and Recht 2009; Keshavan, Montanari, and Oh 2010; Chen et al. 2011; Chiang, Hsieh, and Dhillon 2015), assumes that the data lie in a single low-dimensional subspace and tries to recover a completion of the data that has minimum or small fixed rank. The second group of algorithms, including Mixture of Probabilistic PCA (MPPCA) (Tipping and Bishop 1999a; Gruber and Weiss 2004), Mixture of Factor Analyzers (MFA) (Ghahramani, Hinton, and others 1996), K-GROUSE (Balzano et al. 2012), SSC-Lifting (Elhamifar 2016) and (Eriksson, Balzano, and Nowak 2012), addresses the more general and challenging scenario where the data lie in a union of low-dimensional subspaces. The goals in this case are to recover the missing entries and cluster the data according to the subspaces. The union-of-subspaces model captures many real-world problems, including motion and activity segmentation in videos, recommender systems and multi-label learning, where there exist multiple groups in the data corresponding to different classes or categories, each group being modeled by a single subspace. Since a union of low-dimensional subspaces is often high- or full-rank, methods in the first category are not effective for completing such data.

Matrix Completion with Side Information. In many real-world problems, we have access to additional information about the entries of the data matrix, referred to as side information, which can guide the matrix completion toward more accurate solutions (Adams, Dahl, and Murray 2010; Agarwal and Chen 2009; Menon et al. 2011a; Porteous, Asuncion, and Welling 2010).
For example, in the classical Netflix problem, which aims to predict the unobserved entries of a users-by-movies rating matrix, besides the rating history we have access to features of the users, such as age and gender, as well as features of the movies, such as genre (suspense, science fiction, etc.). Also, in the multi-label learning problem, whose goal is to find all labels relevant to each sample, in addition to the incompletely observed labels, features describing the instances are often given as well. Such side information can be leveraged in matrix completion for better recovery performance, especially when very few matrix entries are observed. Despite its significance, the problem of matrix completion with side information has only recently been studied, and all existing techniques address the setting where the data matrix is low-rank (Goldberg et al. 2010; Menon et al. 2011b; Natarajan and Dhillon 2014; Jain and Dhillon 2013; Xu, Jin, and Zhou 2013; Chiang, Hsieh, and Dhillon 2015; Lu et al. 2016; Chiang, Hsieh, and Dhillon 2016; Liu and Li 2016). The methods in (Menon et al. 2011b; Natarajan and Dhillon 2014) cast the problem as finding a factorization of the data matrix as the product of the side information features with the product of two unknown matrices that must be recovered simultaneously, and employ a non-convex algorithm to recover the unknowns. Although experiments have shown favorable results, these methods rely on non-convex programming and depend on good initialization. The methods in (Jain and Dhillon 2013; Xu, Jin, and Zhou 2013) use side information feature matrices, F_r and F_c, for the rows and columns of the data matrix, Y, in a so-called Inductive Matrix Completion (IMC) framework.
More specifically, assuming that all columns and rows of the data matrix lie in the spaces spanned, respectively, by the column vectors of F_r and F_c, (Jain and Dhillon 2013; Xu, Jin, and Zhou 2013) consider a factorization of Y as Y = F_r Q F_c^T for an unknown low-rank inductive matrix Q, and try to complete the data by finding Q. While (Jain and Dhillon 2013) uses a low-rank matrix factorization, (Xu, Jin, and Zhou 2013) proposes a method, referred to as Maxide, that directly minimizes the rank of Q via a singular value thresholding algorithm. The work in (Chiang, Hsieh, and Dhillon 2015) considers an extension of the IMC framework, referred to as DirtyIMC, to address the problem of matrix completion with noisy side information. More specifically, it considers the model Y = F_r Q F_c^T + R, where the residual matrix R captures the component of the data that the side information cannot describe, and requires both Q and R to be low-rank, hence assuming a low-rank data matrix Y. The work in (Lu et al. 2016) considers a modification of IMC that assumes the inductive matrix Q is sparse, instead of low-rank, to deal with the situation where Q is not necessarily low-rank. Hence, it minimizes the rank of Y while promoting the sparsity of Q. We refer to this method as the Sparse Interactive Model (SIM) in our paper. It is important to note that all of the above works, which address the problem of matrix completion with side information, consider the setting where the data matrix is low-rank. On the other hand, as discussed earlier, in many real-world problems the columns or rows of the data matrix lie in a union of low-dimensional subspaces, which leads to a high-rank data matrix. As a result, existing techniques will not be effective, as we demonstrate in our experiments.

Paper Contributions. In this paper, we address the challenging and general problem of high-rank matrix completion with side information.
Building on (Elhamifar and Vidal 2013; 2009), we assume that each column of the data matrix can be efficiently represented as a sparse combination of a few other columns, which holds both for a single subspace and for a union of subspaces. We cast the problem as recovering the missing entries and the sparse representation coefficients, while taking advantage of the side information to complete the data matrix. More specifically, we propose a factorization of the data into a product of side information matrices with an unknown interaction matrix, under which each column of the data matrix can be reconstructed using a sparse combination of the other columns. As our proposed formulation is non-convex and NP-hard, building on (Elhamifar 2016), we propose a lifting framework, in which we couple the sparse coefficients and missing values and define an equivalent optimization that is amenable to convex relaxation. We derive a convex optimization and propose an efficient implementation of our framework using a Linearized Alternating Direction Method (LADM) (Yang and Yuan 2013), which is significantly faster than standard alternating direction methods, allowing us to efficiently deal with large data and a high percentage of missing entries. Finally, by extensive experiments on synthetic and real data, and in particular by studying the problem of multi-label learning, we demonstrate that our method outperforms existing techniques in both low-rank and high-rank data regimes.

High-Rank Matrix Completion with Side Information

In this section, we propose a method to address the problem of high-rank matrix completion with side information. Assume that we are given a data matrix Y ∈ R^{n×N}, which is partially observed, where Ω and Ω^c denote, respectively, the sets of indices of the observed and missing entries of Y. Assume that every row and column of Y is associated with an observed feature vector, providing side information.
Let F_r ∈ R^{n×k_r} and F_c ∈ R^{N×k_c} denote the side information matrices of feature vectors associated, respectively, with the rows and columns of Y. We refer to F_r and F_c as the row and column side information matrices, as they provide additional information about the relationships between the entries of the data matrix, which we use for data completion. In this paper, we consider a general high/full-rank model for Y by assuming that the columns (or, similarly, rows) of Y lie in a union of low-dimensional subspaces. Our goal is to find the missing entries of Y, while taking advantage of the side information and respecting the underlying model of the data matrix. While the problem of matrix completion with side information has been studied before (Goldberg et al. 2010; Menon et al. 2011b; Natarajan and Dhillon 2014; Jain and Dhillon 2013; Xu, Jin, and Zhou 2013; Chiang, Hsieh, and Dhillon 2015; Lu et al. 2016; Chiang, Hsieh, and Dhillon 2016; Liu and Li 2016), all existing research has focused on the case where the data matrix is low-rank. In this paper, on the other hand, we study and address the more challenging problem of high-rank matrix completion with side information, which covers the low-rank setting as a special case. To tackle the problem, similar to conventional methods (Natarajan and Dhillon 2014; Jain and Dhillon 2013; Xu, Jin, and Zhou 2013; Chiang, Hsieh, and Dhillon 2015; Lu et al. 2016; Chiang, Hsieh, and Dhillon 2016; Liu and Li 2016), we assume that the columns and rows of Y lie in the spaces spanned by the columns of F_r and F_c, respectively. Thus, we can write Y = F_r Q F_c^T, where Q ∈ R^{k_r×k_c} is an unknown interaction matrix. Let Ȳ ∈ R^{n×N} denote the zero-filled data matrix, where the missing entries are filled with zeros. Our goal is to find the complete matrix Y, so that it can be decomposed as Y = F_r Q F_c^T and the entries of Y indexed by Ω coincide with the entries of Ȳ, i.e., R_Ω(Y) = R_Ω(Ȳ).
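For intuition, the factorization and data-fidelity constraint above can be sketched with a simple low-rank-prior baseline in the spirit of the IMC methods reviewed earlier (not this paper's method): estimate Q from the observed entries by proximal gradient with singular value thresholding. The code is a hypothetical NumPy illustration; the step size, penalty `tau`, and iteration count are illustrative choices.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def imc_complete(Y_obs, mask, Fr, Fc, tau=0.01, step=0.5, iters=300):
    """Estimate Q in Y = Fr @ Q @ Fc.T from the entries where mask is True,
    by proximal gradient on 0.5*||R_Omega(Fr Q Fc^T - Y)||_F^2 + tau*||Q||_*."""
    Q = np.zeros((Fr.shape[1], Fc.shape[1]))
    for _ in range(iters):
        resid = mask * (Fr @ Q @ Fc.T - Y_obs)        # residual on observed entries only
        Q = svt(Q - step * (Fr.T @ resid @ Fc), step * tau)
    return Fr @ Q @ Fc.T                               # completed matrix
```

Note that the unknowns drop from n·N entries of Y to the k_r·k_c entries of Q, which is why side information can reduce the number of observations needed; with orthonormal side-information matrices the gradient is 1-Lipschitz, so any step below 1 converges.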
The operator R_Ω(Y) returns a matrix whose (i, j)-th entry is Y_{i,j} when (i, j) ∈ Ω and 0 otherwise. When the number of observed entries is small, in particular when |Ω| < k_r k_c, the problem has infinitely many solutions for the unknowns in the absence of priors on Q and Y. Thus, in addition to utilizing the side information, we need to impose appropriate priors on the unknown parameters and exploit the underlying structure of the data in order to perform completion. To take advantage of the fact that the columns of the data matrix Y lie in a union of subspaces, we use the Self-Expressive Model (SEM) (Elhamifar and Vidal 2013; Elhamifar 2016), which states that each column of the data matrix can be written as a sparse combination of the other columns; hence, Y = Y C, where C ∈ R^{N×N} denotes the self-representation coefficient matrix. In addition, we impose diag(C) = 0 to remove the trivial solution of writing each point as a combination of itself. Notice that the SEM covers both a single and a union of low-dimensional subspaces, since in each subspace of dimension d, every point can be written as a combination of only d other points, in general position, from the same subspace. Thus, to solve the problem of high-rank matrix completion with side information, we propose to solve
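The self-expressive model just described can be illustrated on fully observed data (a hypothetical sketch, not the paper's algorithm, which handles missing entries through the lifting framework): each column y_j is sparse-coded against the remaining columns by solving the lasso problem min_c 0.5·||y_j − Y c||² + λ·||c||₁ subject to c_j = 0, here with plain ISTA. The value of λ and the iteration count are illustrative.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm (entrywise soft thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def self_expressive(Y, lam=0.01, iters=2000):
    """Find C with Y ~= Y @ C and diag(C) = 0 by running ISTA on the
    lasso problem for each column of Y against the other columns."""
    N = Y.shape[1]
    C = np.zeros((N, N))
    L = np.linalg.norm(Y, 2) ** 2           # Lipschitz constant of the gradient
    for j in range(N):
        c = np.zeros(N)
        for _ in range(iters):
            grad = Y.T @ (Y @ c - Y[:, j])  # gradient of the least-squares term
            c = soft_threshold(c - grad / L, lam / L)
            c[j] = 0.0                       # enforce diag(C) = 0
        C[:, j] = c
    return C
```

On data drawn from a union of low-dimensional subspaces, the nonzero coefficients of each column tend to select points from the same subspace, which is what makes the SEM useful for clustering as well as completion.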


Related Articles

Speedup Matrix Completion with Side Information: Application to Multi-Label Learning

In standard matrix completion theory, it is required to have at least O(n ln n) observed entries to perfectly recover a low-rank matrix M of size n × n, leading to a large number of observations when n is large. In many real tasks, side information in addition to the observed entries is often available. In this work, we develop a novel theory of matrix completion that explicitly explores the sid...


Convex Co-Embedding for Matrix Completion with Predictive Side Information

Matrix completion as a common problem in many application domains has received increasing attention in the machine learning community. Previous matrix completion methods have mostly focused on exploiting the matrix low-rank property to recover missing entries. Recently, it has been noticed that side information that describes the matrix items can help to improve the matrix completion performanc...


A Sparse Interactive Model for Matrix Completion with Side Information

Matrix completion methods can benefit from side information besides the partially observed matrix. The use of side features that describe the row and column entities of a matrix has been shown to reduce the sample complexity for completing the matrix. We propose a novel sparse formulation that explicitly models the interaction between the row and column side features to approximate the matrix e...


Provable Inductive Matrix Completion

Consider a movie recommendation system where apart from the ratings information, side information such as user’s age or movie’s genre is also available. Unlike standard matrix completion, in this setting one should be able to predict inductively on new users/movies. In this paper, we study the problem of inductive matrix completion in the exact recovery setting. That is, we assume that the rati...


Graph Matrix Completion in Presence of Outliers

Matrix completion problem has gathered a lot of attention in recent years. In the matrix completion problem, the goal is to recover a low-rank matrix from a subset of its entries. The graph matrix completion was introduced based on the fact that the relation between rows (or columns) of a matrix can be modeled as a graph structure. The graph matrix completion problem is formulated by adding the...



Publication year: 2018